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Abstract 

Background: Rice blast fungus Magnaporthe oryzae is one of the most devastating pathogens in rice. Avirulence 
genes in this fungus share a gene-for-gene relationship with the resistance genes in its host rice. Although numerous 
studies have shown that rice blast R-genes are extremely diverse and evolve rapidly in their host populations, little is 
known about the evolutionary patterns of the Avr-genes in the pathogens. 

Results: Here, six well-characterized Avr-genes and seven randomly selected non-Avr control genes were used to 
investigate the genetic variations in 62 rice blast strains from different parts of China. Frequent presence/absence 
polymorphisms, high levels of nucleotide variation (~1 0-fold higher than non-Avr genes), high non-synonymous to 
synonymous substitution ratios, and frequent shared non-synonymous substitution were observed in the Avr-genes of 
these diversified blast strains. In addition, most Avr-genes are closely associated with diverse repeated sequences, which 
may partially explain the frequent presence/absence polymorphisms in Avr-genes. 

Conclusion: The frequent deletion and gain of Avr-genes and rapid non-synonymous variations might be the primary 
mechanisms underlying rapid adaptive evolution of pathogens toward virulence to their host plants, and these features 
can be used as the indicators for identifying additional Avr-genes. The high number of nucleotide polymorphisms 
among Avr-gene alleles could also be used to distinguish genetic groups among different strains. 
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Background 

Rice blast, caused by infection with the ascomycete 
fungus Magnaporthe oryzae, is one of the most devasta- 
ting known rice diseases. It threatens the stability of rice 
production worldwide [1]. One of the most effective and 
economically viable ways of controlling this disease is 
the utilization of appropriate resistant cultivars that have 
major resistance genes against the rice blast [2]. So far, 
85 blast R genes have been identified in rice, but the fun- 
gus M, oryzae is highly variable and often can overcome 
resistant cultivars within a few years of their initial de- 
ployment in the field [3]. The gene-for-gene hypothesis 
shows that a single plant 7?-gene product recognizes a 
unique avirulence protein [4]. A large number of studies 
have provided evidence that some Avr proteins can be 
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recognized by plant R proteins. A strong and rapid im- 
mune response follows this recognition. This prevents 
further invasion [5-7]. In this way, it has been hypo- 
thesized that the ability to defeat rice 7?-genes may be 
due to the instability of Avr-genes in M. oryzae, inclu- 
ding deletion and translocation of the genes, the inser- 
tion of transposons into the gene or promoter sequence, 
and point mutations [8-11]. 

Currently, 85 major rice blast genes have been gene- 
tically characterized and 19 of them have been cloned 
[6]. Frequent copy number variation, high levels of 
nucleotide diversity, and high non-synonymous to syno- 
nymous substitution ratios (high Ka/Ks, indicating posi- 
tive selection) were observed at these 7?-gene loci, 
suggesting the rapid allelic diversification of these 
7?-genes as an adaptive response to the rapidly changing 
spectrum of rice blast strains [12-14]. In rice blast 
pathogens, nine Avr-genes, all of which encode proteins 
of unknown function, have been cloned. These include 
Avr-Pita, Avr-C039, PWL1, PWL2, ACE1, Avr-Pizt, 
Avr-Pia, Avr-Pii, and Avr-Pik/km/kp [5,9,10,15-17], 
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The direct interaction between Pita and Avr-Pita is the 
first binding effect observed between a plant R and a fungal 
Avr protein [18]. This Avr-gene is highly diversified and it 
has been confirmed that it is capable of rapid changes in 
nature, including complete or partial deletions, transposon 
insertions, point mutations, and even translocations [1]. In 
PWL2, a single nucleotide alteration leading to amino acid 
replacement has been shown to be crucial to its virulence: 
the protein of the deficient type PWL2D is more disordered 
and degenerative and cannot induce an immune response 
in the host plant [19]. ACE1 is an Avr gene distinct from 
others in that it encodes an enzymatic protein that is 
involved in synthesis of a secondary metabolite and an al- 
lele with an insertion mutation that destroys its enzymatic 
function has also been found to deprive it of avirulence [9]. 
Among the recent cloned gene of Avr-Pil</km/kp showed 
frequent presence/absence polymorphisms and an excess of 
nonsynonymous substitutions [20]. Complete and effective 
alleles of PWL1 and Avr-C039 are only present in Magna- 
porthe strains which cannot infect rice cultivars [8,16]. 

Because of their significant role in pathogen-host interac- 
tions, a total of four strains of M. oryzae, 70-15, Inal68, 
Y34, and P131, have been whole-genome sequenced or re- 
sequenced and made available to the public [20-22]. The 
first whole-genome sequenced strain 70-15 was generated 
by backcrossing progeny from a cross between a rice isolate 
and a weeping love grass isolate with the rice isolate Guy 11 
from France. The product of this backcrossing showed re- 
duced female fertility and virulence [22]. EcoTilling experi- 
ments were performed to identify Avr-genes among the 
predicted secreted protein genes from the genome of 
70-15. Results showed that only 22% of the 1032 pex genes 
contained nucleotide substitution in 46 diverse Oryza field 
isolates, but presence/absence polymorphisms might be 
more common in Avr-genes. For example, Avr-Pia, Avr-Pii, 
and Avr-Pik/km/kp are absent from the lab strain 70-15 
and present in strain Inal68 [20]. The results of this study 
suggested that presence/absence polymorphisms might be 
the major mechanism underlying the evolution of rice blast 
Avr-genes. In this way, resequencing and association genet- 
ics might be a useful strategy for cloning additional Avr- 
genes [20]. 

Understanding the composition and dynamics of Avr- 
genes in natural populations from diverse rice-growing 
regions may be useful to breeders attempting to cultivate 
resistant varieties. Six well characterized Avr-genes and 
seven randomly selected control genes were used to investi- 
gate genetic variations in a single laboratory strain from 
Japan and 61 field strains from different parts of China. 

Results 

Presence/absence (P/A) polymorphisms in Avr genes 

Thus far, no more than 10 Avr genes have been cloned 
from M. oryzae. In the present study, six well characterized 



Avr genes, including Avr-Pita 1, PWL2, Avr-Pik/km/kp, 
Avr-Pii, Avr-Pia, and ACE1, and seven randomly selected 
non-Avr genes, which served as control genes (Table 1), 
were used to investigate the genetic variations in 62 rice 
blast strains, including 1 laboratory strain from Japan and 

61 field strains from different parts of China (Additional 
file 1: Table SI). In each control gene, as expected, PCR- 
amplified products and their sequences were obtained 
from all of these 62 blast strains (Table 1). However, the 
PCR-amplified products and sequences were obtained in 
only some strains even after repeated efforts with different 
combinations of primers (Additional file 1: Table S2). This 
suggested that presence/absence polymorphisms might be 
prevalent at these Avr loci in these strains. 

Avr-Pik/km/kp, Avr-Pii, and Avr-Pia effector genes were 
first recognized by their P/A polymorphisms between la- 
boratory strains of Inal68 and 70-15 [20]. In the present 
study, their alleles were found in 56, 10, and 5 out of the 

62 strains, respectively (Table 1). This was consistent with 
previous reports that many Avr genes showed presence/ 
absence polymorphisms in plant pathogens [20,23]. Not- 
ably, only 5 alleles were detected in Avr-Pia locus and 10 
in Avr-Pii locus, even after another three primer pairs 
were used, including previously reported primers [23]. 
This suggested that Avr-Pia and Avr-Pii alleles were 
present in the sampled blast strains at a very low fre- 
quency. Although P/A polymorphisms have also been re- 
ported at the Avr-Pik locus, the highest frequency (56 out 
of 62) was observed in these sampled strains. The other 
three loci, Avr-Pital, PWL2 and ACE1, have moderate fre- 
quence of P/A polymorphisms in our sampled strains 
(Table 1). Overall, pattern of P/A polymorphisms at Avr 
loci showed a clear departure from the control genes in 
the present study. 

Frequency of nucleotide polymorphisms among Avr 
genes 

Haplotype diversity can be used to measure the unique- 
ness of a particular haplotype in a given population. In 
the 7 control gene loci, no allelic diversity was detected 
in 5 gene loci among the 62 strains sampled (Table 1). 
However, diverse alleles were found in 5 of 6 Avr loci 
(haplotype diversity ranged from 0.16 to 0.96) except 
Avr-Pia, which is only present in 5 strains and has no 
diversity. The haplotype diversity reached 0.96 at the 
Avr-Pik locus, suggesting a high level of allelic diversity 
in this locus in the sampled strains. 

A recent genome-wide study showed no nucleotide sub- 
stitutions in -82% (7569 out of 9184) of orthologous gene 
pairs among 3 whole-genome sequenced blast strains 
[21]. This suggested that most of these genes were well- 
conserved in different strains. The average levels of nu- 
cleotide diversity (n) were very low in those orthologous 
pairs (the average ranged from 0.00024 to 0.0006) between 
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Table 1 Loci presence in blast strains from different regions at each locus 


Group 




Strains carrying Avr genes 1 








Haplotype 






JS 


YN 


GD 


SC 


OTs 


Total 


diversity 2 




Total numbers of strains 


14 


13 


6 


26 


3 


62 




AVR genes 


AVR-Pita1 (881 bp) 


9 


11 


3 


21 


3 


47 


0.67 




PWL2 (438 bp) 


7 


10 


6 


21 


2 


46 


0.46 




AVR-Pik (342 bp) 


13 


11 


4 


25 


3 


56 


0.96 




AVR-Pii (213 bp) 


1 


0 


1 


8 


0 


10 


0.16 




AVR-Pio (276 bp) 


3 


0 


0 


2 


0 


5 


0 




ACE1 3 (12694 bp) KS subunit 


11 


10 


5 


19 


2 


47 


0.26 




AT subunit 


12 


7 


4 


19 


1 


43 


0 


Control genes 


CAL 


14 


13 


6 


26 


3 


62 


0 




ACT 


14 


13 


6 


26 


3 


62 


0 




ITS 


14 


13 


6 


26 


3 


62 


0.69 




CHS1 


14 


13 


6 


26 


3 


62 


0.49 




PTH11 


14 


13 


6 


26 


3 


62 


0 




Hik 


14 


13 


6 


26 


3 


62 


0 




ABC2 


14 


13 


6 


26 


3 


62 


0 



^S, Jiangsu province in east of China; YN, Yunnan province in southwest of China; GD, Guangdong province in south of China; SC, Sichuan province in midwest of 
China; OTs, three strains from other regions, one from Anhui province in east of China, one from Hubei province in central China and one from Japan, respectively. 
2 Haplotype diversity calculated by Nei 1987, equations 8.4 and 8.12 but replacing 2n by n. 

3 For ACE1 gene, which is as large as 12 kb and can't be cloned as a whole, we picked 2 subunits of the gene, (3-ketoacyl synthase {KS subunit) and acyltransferase 
(A T subunit) as the tested subunits to imply the characteristics of the whole gene. From 37 strains have we successfully cloned both of the two loci. 



any pair of genomes. Similar results were also observed at 
the 7 control loci. Except fot ITS and CHS loci, no nucleo- 
tide substitution was detected at any of the 5 control loci 
so that only ITS and CHS loci are used in the diversity 
analysis; in which, the nucleotide diversity was 0.001 and 
0.0012, respectively (Table 2). However, nucleotide substi- 
tutions were detected at all 5 Avr loci. The highest levels 
of nucleotide diversity (0.005) were observed at Avr-Pital 
and Avr-Pik. Diversity was ~ 10-fold higher than the ge- 
nome-wide average [22], suggesting that the Avr genes 
retain more genetic diversity than other genes. The most 
pronounced allelic diversity in Avr genes was observed in 
strains from Sichuan Province, which is in midwestern 
China, suggesting that strains from different regions have 
different levels of diversity even at the same Avr locus. 

No frame-shift or premature stop mutations were 
found in any of the sampled Avr or control loci. This is 
in contrast to previous reports, in which frequent frame- 
shift and premature stop mutations were found in 
pathogen Avr genes and plant resistance genes [1,23,24]. 
No frequent insertion/deletion polymorphisms were ob- 
served in these strains. Only one deletion polymorphism 
was detected in the first exon region of Avr-Pital allele 
in strain SC3. It was three nucleotides in length. In 
addition, significantly more nucleotide substitutions 
were found in coding-sequence regions than in non- 
coding sequence regions at Avr-Pital and Avr-Pik loci 
(Table 2), suggesting that diversifying selection took 



place in the CDS regions of these two Avr loci. This is 
consistent with a recent study, in which different groups 
of strains are collected worldwide and the diversity of 
coding region is larger than that of non-coding region in 
each group [25] . 

However, no significant difference was detected bet- 
ween nucleotide diversity within regions and divergence 
of strains between different regions, except for Avr-Pital 
locus (Additional file 1: Table S3). In Avr-Pital alleles, 
significantly higher levels of D xy were observed than nu- 
cleotide diversity within regions. This was especially true 
of combinations of GD with other regions, suggesting 
more pronounced differentiation of blast strains between 
GD and other regions at the Avr-Pital locus. 

Non-synonymous (Ka) and synonymous (Ks) nucleotide 
substitutions at Avr loci 

In one recent study, a total of 703 gene loci (7.6%) were 
detected with Ka > Ks from 9184 orthologous gene 
pairs among three whole-genome sequenced blast strains. 
These included 6 gene pairs for which Ka/Ks > 1 and 697 
gene pairs with only non-synonymous substitutions 19 . 
However, Ka was found to be greater than Ks for 4 of the 
5 Avr loci sampled here (Table 3). This suggested that 
these Avr alleles had undergone strong positive selection. 
Especially among Avr-Pik alleles, 7 non-synonymous sub- 
stitutions and no synonymous substitution were found in 
strains from any region (Table 3). This suggested a strong 



Table 2 Nucleotide diversity of Avr and control genes 



Avr-genes Control loci 





Avr-Pital (817 bp) 




PWL2 903 bp 




Avr-Pik 1371 bp 


Avr- Pi i 213 bp 


ACE1 2193 




CHS 7 925 bp 


ITS 

771 bp 


Others 




Non-coding 250 bp 


CDS 567 bp 


Non-coding 465 


CDS 438 bp 


Non-coding 
1029 bp 


CDS 
342 bp 


CDS 213 bp 


Non-coding 
87 bp 


CDS 2106 bp 


CDS 925 bp 


All region 


All region 


JS 


0 


0.002 


0 


0.001 


0.001 


0.004 


/ 


0 


0 


0 


0.0008 


0 


GD 


0 


0.001 


0.002 


0 


0.001 


0.003 


/ 


0.0035 


0.003 


0.001 


0.0005 


0 


SC 


0.002 


0.006 


0.001 


0.001 


0.001 


0.005 


0.0003 


0.0021 


0.002 


0.001 


0.0012 


0 


YN 


0 


0 


0.001 


0 


0 


0.004 


/ 


0 


0 


0 


0.0002 


0 


All sequences 


0.002 


0.005 


0.001 


0.001 


0.001 


0.005 


0.0003 


0.001 


0.001 


0.001 


0.0012 


0 



ITS isn't a protein coding gene and doesn't have a CDS sequence. The bold style reflects that the diversity of CDS region is significantly larger than that of non-coding region. 
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Table 3 Non-synonymous (Ka) and synonymous (Ks) nucleotide substitutions at Avr and control loci 



Regions 


Avr-Pital 




PWL2 






Avr- Pi k 




ACE1-KS 






CHS 






Ks 


Ka 


Ka/Ks 


Ks 


Ka 


Ka/Ks 


Ks 


Ka 


Ka/Ks 


Ks 


Ka 


Ka/Ks 


Ks 


Ka 


Ka/Ks 


JS 


0 


0.002 


Ko>Ks 


0 


0.002 


Ko>Ks 


0 


0.005 


Ko>Ks 


0 


0 




0 


0 




GD 


0 


0.002 


Ko>Ks 


0 


0 




0 


0.004 


Ko>Ks 


0.012 


0 


0 


0 


0.001 


Ko>Ks 


SC 


0.004 


0.007 


1.75 


0 


0.001 


Ko>Ks 


0 


0.006 


Ko>Ks 


0.007 


0 


0 


0 


0.002 


Ko>Ks 


YN 


0 


0 




0 


0 




0 


0.005 


Ko>Ks 


0 


0 




0 


0 




All sequences 


0.003 


0.006 


2 


0 


0.001 


Ko>Ks 


0 


0.006 


Ko>Ks 


0.004 


0 


0 


0 


0.001 


Ko>Ks 



Avr-Pii only has 10 sequences and it's not informative enough to calculate Ks and Ka at this locus. So this locus is removed from the nonsynonymous/synonymous analysis. 
"-" means Ks and Ka both equal to zero or Ka/Ks is not able to be calculated. 



positive selection at this locus. At the Avr-Pital locus, Ka 
was found to be greater than Ks in strains from three re- 
gions, GD, JS, and SC, and no nucleotide substitutions 
were detected in strains from YN. Higher levels of nucleo- 
tide diversity were also observed at Avr-Pita and Avr-Pik 
loci, suggesting that the strong positive selection might 
drive the rapid diversification of Avr genes. 

Genetic exchanges indicated by phylogenetic trees of 
Avr alleles 

To further clarify the phylogenetic relationships among 
Avr alleles, neighbor-joining trees were constructed based 
on nucleotide variations at each locus. These trees revealed 
substantial variations from tree shapes and depth, indi- 
cating a wide range of evolutionary histories. We per- 
formed this analysis for Avr-Pita, PWL2, Avr-Pik and 
ACE1, but not for Avr-Pia and Avr-Pii due to their small 
samples (only 5 and 10 strains). The Avr-Pik and Avr-Pital 
loci were found to possess many low-frequency variants, 
but only a small number of them contained multiple al- 
leles at intermediate frequencies (Figure 1). There was also 
a higher level of non-synonymous substitutions at these 
two Avr loci. This might indicate that positive selection 
acted on these two Avr genes. 

At the PWL2 and ACE1 loci, two major phylogenetic 
subclades were observed in each tree, suggesting that a 
large number of them contained multiple alleles at in- 
termediate frequencies. Only one non-synonymous nu- 
cleotide polymorphism was found between these two 
subclades at the PWL2 locus (Figure 1), and this non- 
synonymous variation has been proven to be a key substi- 
tution for gene avirulence and virulence that results in 
recognition or loss of recognition of the pathogen by the 
host plants [17]. In other words, three types of groups, ab- 
sence, toxic, and non-toxic, were found in all of blast 
strains at this locus, and the toxic and non-toxic groups 
were clustered in two distinct clades, suggesting that these 
virulence and avirulence functional alleles might be main- 
tained by balancing selection. Similar results were ob- 
served in ACE1 locus. 

These Avr-gene phylogenetic trees showed another rele- 
vant characteristic: theoretically, the distribution of blast 



strains should show significant clustering with the geo- 
graphic structure of these strains. However, strains from 
the same region did not cluster in the same clade (Figure 1). 
Rather, they were scattered among many mixed clades, 
indicating that shared polymorphisms and haplotypes were 
widespread among different regional strains. To con- 
firm this, 26 additional Avr-Pital amino acid sequences 
collected from different countries were retrieved from 
GenBank and used to construct a NJ tree with the blast 
sequences from the present study [1]. The topology was 
found to be consistent with the phylogenic tree of se- 
quences from the present strains (Figure 1 and Additional 
file 1: Figure SI). The additional haplotypes from the U.S., 
Colombia, Egypt, India, and the Philippines were also 
scattered across the phylogenic tree, suggesting that fre- 
quent genetic exchanges across different parts of China 
and even different continents may be widespread at these 
Avr-gene loci. Interestingly,a recent study [25] has shown 
that strains from Thailand tend to cluster in the phylogenic 
tree at AVR-Pital locus, suggesting different evolutionary 
pattern or selective pressure at this locus in the strains of 
Thailand. 

Because the genome-wide genetic diversity is very 
slight between blast strains it is difficult to distinguish 
the population structure using these limited markers 
[21], The high number of nucleotide polymorphisms 
among Avr gene alleles could be used to distinguish 
genetic groups across different strains by using the 
sequences from single or multiple Avr-gene loci. A 
phylogenetic tree showing 62 rice blast strains was con- 
structed by nucleotide sequences from these 4 Avr loci. 
In this tree, each blast strain was distinguished by 
Avr- allelic sequences, and these sequences could be used 
to identify different blast strains. 

Discussion 

Contribution of presence/absence polymorphisms in 
vW-genes to the rapid adaptive evolution of rice 
blast pathogens 

To date, more than 80 rice blast resistance genes have 
been identified in various rice cultivars [26]. However, 
many resistance genes lost their effectiveness within a 
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4 
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3 

T 
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14 
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Figure 1 The phylogenic trees of Avr-genes. In this Figure, each tree represents the phylogenetic relationship of one AVR gene. The colored 
shape indicates the origin of the strain, and the number of strains is labeled behind the shape if there are more than one member at the the 
same position. In the tree of PWL2, there are only two kinds of alleles. One of them is the wild type, which is intoxical, while the other is 
inefficient and toxical. 



short period of time because of the high levels of variabil- 
ity among the Avr genes in the blast strains, suggesting 
rapid adaptation by the pathogen [18,27-29]. Previous 
studies have shown that Avr-genes might be undergoing 
frequent mutational events, including spontaneous dele- 
tions, nucleotide substitutions, and inactivation by TE in- 
sertions, which can lead to loss of avirulence [1,27]. A 



genome-wide survey showed that about half of the charac- 
terized blast Avr-genes, including Avr-Pik, Avr-Pii, and 
Avr-Pia, were absent in the assembled blast 70-15 
genome, and the presence/absence polymorphisms might 
be the major mechanisms in evolution of Avr-genes [20]. 

Despite repeated efforts with different primer combi- 
nations, the amplification of the six Avr-genes evaluated 
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in this study failed in many strains (average on 41% and 
ranging from 97% at Avr-Pik locus to 92% at Avr-Pia 
locus), but the PCR-amplified products and sequences of 
all control genes were obtained from all 62 blast strains 
(Table 1). This suggested that presence/absence poly- 
morphisms might be prevalent at Avr loci in these sam- 
pled strains. Previous studies have shown that Avr-Pita, 
Avr-Pia, Avr-Pik, and Avr-Pii are highly variable in their 
genome location and have undergone multiple transloca- 
tions in their genomes [29]. 

In order to explain the prevalence of presence/absence 
polymorphisms and multiple translocations across the 
genome in these Avr-genes, the genomic composition of 
the flanking sequences was surveyed in the area around 

4 Avr-genes and 5 control genes. As expected, a large 
number of repetitive sequences with at least 5 copies 
were found in the flanking sequences of 3 Avr-genes, 
specifically Avr-Pita, Avr-Pik, and PWL2 (Figure 2), sug- 
gesting that most Avr-genes are closely associated with 
diverse repeated sequences, including transposable ele- 
ments. However, the flanking sequences surrounding the 

5 randomly selected non-Avr genes (control genes) were 



non-repetitive (Figure 2). This may partially explain why 
frequent presence/absence polymorphisms and translo- 
cations across the genome are detected in Avr-genes. 
These results also indicated that the frequent deletion 
of avirulence genes (presence/absence polymorphisms) 
might be the primary mechanism underlying the rapid 
adaptation of pathogens toward virulence to their host 
plants. The translocation of Avr-genes may be associated 
with frequent recovery through transfer from other indi- 
viduals, suggesting that the pronounced mobility of Avr 
might be the key mechanism underlying the rapid adap- 
tation toward plant i?-genes [29] . 

Levels of non-synonymous substitutions and positive 
selection 

To identify Avr-genes, a total of 1032 of predicted se- 
creted protein genes in strain of 70-15 were tested using 
EcoTilling experiments in 46 diverse Oryza field isolates. 
Approximately 78% of them showed no nucleotide poly- 
morphisms [20]. A similar result was observed in an- 
other study, in which no nucleotide substitutions were 
detected in -82% (7569 out of 9184) of orthologous gene 



target gene 
replication number 

■ 

■ >5 

0 30k 
I I 

ACE-1 



0 10k 

I I 

AVR-Pita 



AVR-Pik 



PWL2 



calmodulin 



actin 



CHS 



hik 



pthll 



Figure 2 The genomic composition of the flanking sequences. The copy numbers of the flanking sequences of different loci are shown in 
this Figure. Each line displays the 10 kb flanking sequence of a gene (for ACE 7 the flanking length is 30 kb as this gene is too long). The green 
region shows the position of the target gene, the red region indicates that this part of sequence has more than 5 copies in the whole genome 
while grey region only has one copy. Blue region has copy numbers ranging from 2 to 5. 
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pairs among three whole-genome sequenced blast strains 
[21]. However, among the 6 well-characterized Avr-genes 
evaluated in this study, nucleotide substitutions were de- 
tected in 5 out of 6 Avr loci. In Avr-Pital and 
Avr-Pik, the nucleotide diversity (0.005) was ~ 10-fold 
more pronounced than the genome-wide average, sug- 
gesting that the Avr-genes retain considerably more 
genetic diversity than other genes in this genome. 

In the 9184 orthologous gene pairs, Ka was found to 
be greater than Ks for only 703 gene loci (-7.6%). This 
included 6 pairs of genes for which Ka/Ks > 1 and 697 
pairs with only non-synonymous substitutions [21]. 
However, among the Avr-genes sampled here, Avr-Pital, 
Avr-Pik, and PWL2 showed an excess of non-synony- 
mous substitutions for which Ka > Ks (Table 3). This 
suggested that these Avr alleles had undergone strong 
positive selection. 

At the Avr-Pital and Avr-Pik loci, high levels of 
nucleotide substitutions and presence/absence polymor- 
phisms were detected, and most variations in the DNA 
sequence were observed in the exon regions, most of 
them in the form of amino acid substitutions (Table 3). 
Only 7 nonsynonymous polymorphic sites were found in 
Avr-Pik alleles, suggesting that these Avr-genes are under 
positive selection and that the rapid variations in these 
alleles might be responsible for defeating race-specific 
resistance. 

In one recent study, 5 Avr-Pik alleles were observed. 
They had 5 nonsynonymous polymorphic sites in 21 
Japanese blast isolates [30]. The three major alleles, D, E, 
and A, are likely to have different levels of toxity and 
arose at different time. A is likely to be newly derived. 
Strain A often escapes from the surveillance of Pikp and 
Pik*, given that the ancestral type D fails to overcome 
the Pik alleles [30]. The authors propose that the Avr-pik 
and Pik are locked into an arms race driven by co- 
evolution. And these are also the three major alleles in 
our result. What's more, we also find some other alleles. 

Two additional amino acid replacements, 2-H and 103-G 
(marked in blue in Additional file 1: Figure S2) were found 
in these strains. The replacement of arginine by histidine 
at the second position was found to occur within the 
signal peptide region. This region is highly significant to 
the secretion of Avr-Pik. This replacement may influence 
the interactions between Avr- and i?-proteins in the plant 
hosts. This is consistent with the results of a previous 
study that showed that OT2, a strain containing this 
replacement, showed stronger pathogenicity than the 
strains of Avr-Pik- A. Some new combinations of amino 
acid alleles were also observed, indicating pronounced 
diversity among protein sequences. The rapid evolution of 
new strains might be attributed to rapid loss of function of 
avirulence effector genes that correspond to the resistance 
genes in a gene-for-gene manner [4,29] . 



At the PWL2 locus, only one non-synonymous sub- 
stitution was detected. It has been proven to be a key 
substitution for avirulence or virulence function. Inter- 
estingly, two candidate 7?-genes might be responsible for 
the recognition of the Avr-gene of PWL2 (Additional 
file 1: Table S4). These genes are resistant to most of the 
strains with the PWL2 genotype and are susceptible to 
the strains with the PWL2D genotype. 

Estimation of the population genetics of blast strains 

Though high-throughput sequencing is a highly valuable 
form of DNA sequencing, genome-wide genetic diversity 
between rice blast strains is so slight that the cost of 
using high-throughput sequencing to determine popula- 
tion structure becomes prohibitive [20,21]. The data col- 
lected in the current work showed that the high level of 
nucleotide polymorphisms among Avr-gene alleles could 
be used to distinguish genetic groups among different 
strains by using the sequences from single Avr-gene loci 
or multiple Avr-gene loci, and each blast strain was dis- 
tinguished using Avr- allelic sequences at the 5 of 6 Avr 
loci (Additional file 1: Figure S3). 

Conclusions 

In the study of 6 Avr-genes in 62 Magnaporthe oryzae 
strains, a prevalence of presence/absence polymorphisms 
is observed, which may be crucial for the infestants to 
escape from the immune systems of the hosts. This may 
partly be explained by the fact that most Avr-genes are 
closely associated with diverse repeated sequences. 
Relatively high rate of nonsynonymous replacements are 
discovered and some of the polymorphisms have been 
proved responsible for the alteration of the gene func- 
tion. The frequent present/absent polymorphisms and 
SNP of Avr-genes could be used to distinguish genetic 
groups between different strains. 

Methods 

Strains of Magnaporthe oryzae 

A total of 62 different M. oryzae strains from different 
parts of China were collected and used in Avr-gene amp- 
lification (Additional file 1: Table SI). These strains rep- 
resent most of the major areas of rice production in 
China reasonably well. 

DNA extraction and PCR amplification 

Each strain of M. oryzae was first cultured on PDA 
medium in tubes and then transferred to PDB medium 
in conical flasks to harvest hyphae. They were kept in a 
shaking incubator at a speed of 180 rpm at 26°C for 
48 h, after which the flasks were full of balls of hyphae. 
These were removed by filtration and then baked out 
with vacuum dryer at 30°C. The dried hyphae were 
ground to powder in mortar with liquid nitrogen. The 
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powder was transferred to microtubes and processed 
with a DNeasy Plant Mini Kit from Qiagen to produce 
the DNA template. 

To determine the sequence of these loci, a standard 
PCR process was carried out as follows: Primers were 
designed to make products 500-1000 bp in length that 
included the target gene of the coding sequences. When 
sequencing failed, more primers were brought out and 
different conditions were tried. All primers are listed in 
Additional file 1: Table S2. The reaction was carried out 
in a 20 ul LA Taq system containing 2.6 ul of template, 
3.2 ul of dNTP (2.5 mM each, Takara), 10 ul of GC buf- 
fer (2X, Takara), 0.2 ul of LA Taq (5 u/uL, Takara), 2 ul 
of upstream and downstream primers at concentration 
of 2 umol/L. PCR products were sequenced on an 
ABI3100A automated sequencer. 

Data analysis 

The amino acid sequences were first aligned using the 
MUSCLE software package with default options [31]. The 
resulting amino acid sequence alignments were then used 
to guide the alignments of the corresponding nucleotide 
coding sequences. Based on the nucleotide alignment 
results, phylogenetic trees were constructed using the 
bootstrap neighbor-joining method with Kimura two- 
parameter model in MEGA [32]. The stability of internal 
nodes was assessed by bootstrap analysis with 1,000 repli- 
cates. Haplotype diversity, nucleotide diversity with the 
Jukes and Cantor correction, and nonsynonymous and 
synonymous substitutions were calculated with DnaSP 
[33]. To determine the number of replications of the re- 
gion adjacent to target loci, a 10 kb sequence containing 
the target locus from the 70-15 genome was selected 
(a 30 kb sequence was selected for ACE1) and used to find 
all copies of each fragment in any part of the blast 
genome. 

Additional file 
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